Feature Noising for Log-Linear Structured Prediction
Authors
Abstract
NLP models have many, sparse features, and regularization is key for balancing overfitting against underfitting. A recently repopularized form of regularization is to generate fake training data by repeatedly adding noise to real data. We reinterpret this noising as an explicit regularizer, and approximate it with a second-order formula that can be used during training without actually generating fake data. We show how to apply this method to structured prediction using multinomial logistic regression and linear-chain CRFs. We tackle the key challenge of developing a dynamic program to compute the gradient of the regularizer efficiently. Because the regularizer is a sum over inputs, we can estimate it more accurately via a semi-supervised or transductive extension. Applied to text classification and NER, our method provides a >1% absolute performance gain over standard L2 regularization.
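To make the second-order idea concrete, the sketch below illustrates one standard instance of it: for multinomial logistic regression under additive isotropic Gaussian feature noise, a second-order Taylor expansion of the expected log-partition function yields a closed-form penalty, (σ²/2) Σⱼ Var_{y∼p(y|x)}[θ_{y,j}], which can be added to the training objective without sampling any noisy copies of the data. This is a minimal illustration under our stated noise assumption, not the paper's CRF dynamic program, and the function names are ours.

```python
import numpy as np

def softmax(scores):
    # Numerically stable softmax over class scores.
    z = scores - scores.max()
    e = np.exp(z)
    return e / e.sum()

def noising_regularizer(theta, x, sigma2=1.0):
    """Second-order approximation of the feature-noising penalty.

    theta: (K, d) class weight matrix for multinomial logistic regression.
    x:     (d,) input feature vector.
    sigma2: variance of the assumed additive Gaussian feature noise.

    Returns (sigma2/2) * sum_j Var_{y ~ p(y|x)}[theta_{y,j}],
    the curvature term from expanding E_noise[log Z(x + eps)] around x.
    """
    p = softmax(theta @ x)            # model distribution p(y|x), shape (K,)
    mean = p @ theta                  # E_y[theta_y]   per feature, shape (d,)
    second = p @ (theta ** 2)         # E_y[theta_y^2] per feature, shape (d,)
    var = second - mean ** 2          # per-coordinate variance over classes
    return 0.5 * sigma2 * var.sum()
```

Note that when every class shares the same weight vector the variance term vanishes, so the penalty is zero, and in general it grows where the model's class-conditional weights disagree at confident predictions, which is what makes it act as a regularizer.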
Similar Resources
Structured Prediction with Output Embeddings for Semantic Image Annotation
We address the task of annotating images with semantic tuples. Solving this problem requires an algorithm able to deal with hundreds of classes for each argument of the tuple. In such contexts, data sparsity becomes a key challenge. We propose handling this sparsity by incorporating feature representations of both the inputs (images) and outputs (argument classes) into a factorized log-linear m...
Taming Structured Perceptrons on Wild Feature Vectors
Structured perceptrons are attractive due to their simplicity and speed, and have been used successfully for tuning the weights of binary features in a machine translation system. In attempting to apply them to tuning the weights of real-valued features with highly skewed distributions, we found that they did not work well. This paper describes a modification to the update step and compares the...
A Neural Probabilistic Structured-Prediction Model for Transition-Based Dependency Parsing
Neural probabilistic parsers are attractive for their capability of automatic feature combination and small data sizes. A transition-based greedy neural parser has given better accuracies over its linear counterpart. We propose a neural probabilistic structured-prediction model for transition-based dependency parsing, which integrates search and learning. Beam search is used for decoding, and c...
Softmax-Margin Training for Structured Log-Linear Models
We describe a method of incorporating task-specific cost functions into standard conditional log-likelihood (CLL) training of linear structured prediction models. The method was recently introduced in the speech recognition community; we describe it generally for structured models, highlight connections to CLL and max-margin learning for structured prediction (Taskar et al., 2003), and show that the me...
Softmax-Margin CRFs: Training Log-Linear Models with Cost Functions
We describe a method of incorporating task-specific cost functions into standard conditional log-likelihood (CLL) training of linear structured prediction models. The method was recently introduced in the speech recognition community; we describe it generally for structured models, highlight connections to CLL and max-margin learning for structured prediction (Taskar et al., 2003), and show that the me...